small target
A methodology for clinically driven interactive segmentation evaluation
Esmaeili, Parhom, Fernandez, Virginia, Borges, Pedro, Gibson, Eli, Ourselin, Sebastien, Cardoso, M. Jorge
Interactive segmentation is a promising strategy for building robust, generalisable algorithms for volumetric medical image segmentation. However, inconsistent and clinically unrealistic evaluation hinders fair comparison and misrepresents real-world performance. We propose a clinically grounded methodology for defining evaluation tasks and metrics, and build a software framework for constructing standardised evaluation pipelines. We evaluate state-of-the-art algorithms across heterogeneous and complex tasks and observe that (i) minimising information loss when processing user interactions is critical for model robustness, (ii) adaptive-zooming mechanisms boost robustness and speed convergence, (iii) performance drops if validation prompting behaviour/budgets differ from training, (iv) 2D methods perform well with slab-like images and coarse targets, but 3D context helps with large or irregularly shaped targets, and (v) performance of non-medical-domain models (e.g. SAM2) degrades with poor contrast and complex shapes.
Appendix A Object Query Generation
The text-guided object detection network, as described in Section 3.1.1, [...] As mentioned in Section 3.1.2, [...] The computation of these spatial relation features is explained in detail below. The orientation between two objects is represented by encoding the angle values of the line that connects their centers in the spherical coordinate system. The above calculation results are combined as the spatial relation features of "Distance & Orientation".
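The "Distance & Orientation" idea above can be illustrated with a minimal sketch. It assumes that each object is summarised by a 3D center point and that the orientation is given by the polar and azimuthal angles of the line joining the two centers. The function name `distance_and_orientation` and the choice of angle convention are illustrative assumptions; the excerpt does not specify how the angles are further encoded.

```python
import math


def distance_and_orientation(center_a, center_b):
    """Return (r, theta, phi) for the vector from center_a to center_b.

    r     : Euclidean distance between the two centers
    theta : polar angle measured from the +z axis, in [0, pi]
    phi   : azimuthal angle in the x-y plane, in (-pi, pi]

    Hypothetical helper: the angle convention is an assumption, not
    something stated in the source text.
    """
    dx = center_b[0] - center_a[0]
    dy = center_b[1] - center_a[1]
    dz = center_b[2] - center_a[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        # Coincident centers: the orientation is undefined, so return zeros.
        return 0.0, 0.0, 0.0
    theta = math.acos(dz / r)
    phi = math.atan2(dy, dx)
    return r, theta, phi


# Example: object B lies one unit along the x axis from object A.
r, theta, phi = distance_and_orientation((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

In this sketch the three values would simply be concatenated into one feature vector per object pair; a real system might additionally apply a sinusoidal or learned encoding to the angles.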
DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection
The detection of small objects in aerial images is a fundamental task in the field of computer vision. Moving objects in aerial photography present problems such as varying shapes and sizes, dense overlap, occlusion by the background, and object blur. The original YOLO algorithm, however, has low overall detection accuracy on such objects because of its weak ability to perceive targets of different scales. In order to improve the detection accuracy of densely overlapping small targets and fuzzy targets, this paper proposes a dynamic-attention scale-sequence fusion algorithm (DASSF) for small target detection in aerial images. First, we propose a dynamic scale sequence feature fusion (DSSFF) module that improves the up-sampling mechanism and reduces computational load. Second, an x-small object detection head is added to enhance the detection capability for small targets. Finally, to improve the expressive ability for targets of different types and sizes, we use the dynamic head (DyHead). The proposed model addresses the problem of small target detection in aerial images and is general enough to be applied to multiple versions of the YOLO algorithm. Experimental results show that when the DASSF method is applied to YOLOv8, the model improves mean average precision (mAP) over YOLOv8n by 9.2% and 2.4% on the VisDrone-2019 and DIOR datasets, respectively, and outperforms the current mainstream methods.
Small and Dim Target Detection in IR Imagery: A Review
Kumar, Nikhil, Singh, Pravendra
While there has been significant progress in object detection using conventional image processing and machine learning algorithms, exploring small and dim target detection in the IR domain is a relatively new area of study. The majority of small and dim target detection methods are derived from conventional object detection algorithms, albeit with some alterations. The task of detecting small and dim targets in IR imagery is complex. This is because these targets often lack distinct features, the background is cluttered with unclear details, and the IR signatures of the scene can change over time due to fluctuations in thermodynamics. The primary objective of this review is to highlight the progress made in this field. This is the first review in the field of small and dim target detection in infrared imagery, encompassing various methodologies ranging from conventional image processing to cutting-edge deep learning-based approaches. The authors have also introduced a taxonomy of such approaches. There are two main types of approaches: methodologies using several frames for detection, and single-frame-based detection techniques. Single-frame-based detection techniques encompass a diverse range of methods, spanning from traditional image processing-based approaches to more advanced deep learning methodologies. Our findings indicate that deep learning approaches perform better than traditional image processing-based approaches. In addition, a comprehensive compilation of various available datasets has also been provided. Furthermore, this review identifies the gaps and limitations in existing techniques, paving the way for future research and development in this area.
Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization
Li, Qingyang, Li, Yuchen, Duan, Hongyi, Kang, JiaLiang, Zhang, Jianan, Gan, Xueqian, Xu, Ruotong
This paper studies the limitations of the YOLOv5s model on the small target detection task and proposes improvements. The performance of the model is enhanced by introducing a GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer attention mechanisms, and a loss function improved with NWD. The experimental results validate the positive impact of these improvement strategies on model precision, recall and mAP. In particular, the improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection, and lays a solid foundation for future related research and applications.
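For readers unfamiliar with NWD (Normalized Wasserstein Distance), the following is a minimal sketch of the commonly used formulation, in which each box is modelled as a 2D Gaussian and the similarity is an exponential of the Wasserstein distance between the two Gaussians. It is not taken from this paper: the function names, the `(cx, cy, w, h)` box layout and the default value of the normalising constant `c` are assumptions made for illustration, and the abstract does not say how NWD is combined with the other YOLOv5s loss terms.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes.

    Boxes are (cx, cy, w, h). `c` is a dataset-dependent normalising
    constant; the default here is only a placeholder assumption.
    Returns a similarity in (0, 1], equal to 1 for identical boxes.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(box_a, box_b, c)


# Two tiny, non-overlapping boxes still get a smooth, non-zero similarity,
# whereas their IoU would be exactly zero.
similarity = nwd((10.0, 10.0, 4.0, 4.0), (15.0, 10.0, 4.0, 4.0))
```

The appeal for small targets is that the measure changes smoothly with center offset even when boxes do not overlap, which IoU-based losses cannot provide.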
iSmallNet: Densely Nested Network with Label Decoupling for Infrared Small Target Detection
Hu, Zhiheng, Wang, Yongzhen, Li, Peng, Qin, Jie, Xie, Haoran, Wei, Mingqiang
Small targets are often submerged in cluttered backgrounds of infrared images. Conventional detectors tend to generate false alarms, while CNN-based detectors lose small targets in deep layers. To this end, we propose iSmallNet, a multi-stream densely nested network with label decoupling for infrared small object detection. On the one hand, to fully exploit the shape information of small targets, we decouple the original labeled ground-truth (GT) map into an interior map and a boundary one. The GT map, in collaboration with the two additional maps, tackles the unbalanced distribution of small object boundaries. On the other hand, two key modules are delicately designed and incorporated into the proposed network to boost the overall performance. First, to maintain small targets in deep layers, we develop a multi-scale nested interaction module to explore a wide range of context information. Second, we develop an interior-boundary fusion module to integrate multi-granularity information. Experiments on NUAA-SIRST and NUDT-SIRST clearly show the superiority of iSmallNet over 11 state-of-the-art detectors.
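The label decoupling step can be pictured with a small sketch. It assumes a binary ground-truth mask and treats a foreground pixel as "interior" when all four of its direct neighbours are also foreground, and as "boundary" otherwise. This erosion-style rule and the function name `decouple_mask` are assumptions for illustration only; the abstract does not state how iSmallNet actually derives its interior and boundary maps.

```python
def decouple_mask(mask):
    """Split a binary mask (list of rows of 0/1) into interior and boundary maps.

    Hypothetical rule: a foreground pixel is interior if its 4-connected
    neighbours are all inside the image and all foreground; every other
    foreground pixel is boundary.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    interior = [[0] * width for _ in range(height)]
    boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            is_interior = all(
                0 <= ny < height and 0 <= nx < width and mask[ny][nx]
                for ny, nx in neighbours
            )
            if is_interior:
                interior[y][x] = 1
            else:
                boundary[y][x] = 1
    return interior, boundary


# A 3x3 target inside a 5x5 image: one interior pixel, eight boundary pixels.
gt = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
interior_map, boundary_map = decouple_mask(gt)
```

The example also shows why boundary pixels dominate for small targets: even a 3x3 object is almost entirely boundary under this rule, which is consistent with the unbalanced boundary distribution the abstract mentions.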
A Bioinspired Retinal Neural Network for Accurately Extracting Small-Target Motion Information in Cluttered Backgrounds
Huang, Xiao, Qiao, Hong, Li, Hui, Jiang, Zhihong
Robust and accurate detection of small moving targets in cluttered moving backgrounds is a significant and challenging problem for robotic visual systems to perform search and tracking tasks. Inspired by the neural circuitry of elementary motion vision in the mammalian retina, this paper proposes a bioinspired retinal neural network based on a new neurodynamics-based temporal filtering and multiform 2-D spatial Gabor filtering. This model can estimate motion direction accurately via only two perpendicular spatiotemporal filtering signals, and respond to small targets of different sizes and velocities by adjusting the dendrite field size of the spatial filter. Meanwhile, an algorithm of directionally selective inhibition is proposed to suppress the target-like features in the moving background, which can reduce the influence of background motion effectively. Extensive synthetic and real-data experiments show that the proposed model works stably for small targets of a wider size and velocity range, and has better detection performance than other bioinspired models. Additionally, it can also extract the information of motion direction and motion energy accurately and rapidly.
Big Data, Small Target: The Smart Approach To Artificial Intelligence
Companies that have invested heavily in big data solutions want to know how to make smart, strategic investments that will distinguish them from the competition and enable the best possible return before making the decision to go all in. In the past, not all enterprise big data initiatives went as planned. These failures are not usually published, but the big data failure rate is unusually high. According to Gartner, only 15% of businesses make it past the pilot stage of these projects. Our fear, as leaders of technology companies, is that with so much attention surrounding AI, the pressure is on to apply the technology or risk falling behind the many decision makers who are adopting technologies without first establishing clear business goals and understanding the differences between AI and ML and how they should be applied. It's easy to get caught up in the allure of artificial intelligence as well as its hype, including breakthroughs like deep learning, but those looking to make an outsized impact should instead focus on its more practical counterpart: good old-fashioned machine learning -- or "cheap learning," as my colleague Ted Dunning and Ellen Friedman explain in their guide Practical Machine Learning: Innovations in Recommendation.
Big Data, Small Target: The Smart Approach To Artificial Intelligence
This is why companies like Google (a leading investor in our company), Amazon, Facebook, Alibaba and Baidu are so powerful from an AI perspective. These companies have enormous data sets that they've been capturing for decades across a wide variety of trended patterns. This data has fed into their algorithms for years, making them increasingly more refined, accurate and targeted. For most enterprise companies, the big challenge is that it's not always clear at the time data is collected what's going to matter down the road. This makes it hard to know what to measure today and if that measurement will be valuable in the future.